{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6ece9c89",
   "metadata": {},
   "source": [
    "# Basic Tensor Operations\n",
    "\n",
    "This simple example demonstrates how to create various\n",
    "tensors and perform basic arithmetic operations using TT-NN, a\n",
    "high-level Python API. These operations include addition,\n",
    "multiplication, and matrix multiplication, and simulating\n",
    "broadcasting a row vector across a tile.\n",
    "\n",
    "Let's create the example file, `ttnn_basic_operations.py`\n",
    "\n",
    "## Import Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9864d745",
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import numpy as np\n",
    "import ttnn\n",
    "from loguru import logger"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d98f3e88",
   "metadata": {},
   "source": [
    "## Open the Device\n",
    "\n",
    "Create a device to run our program."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "64940eb5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Open the Device\n",
    "device = ttnn.open_device(device_id=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38c23802",
   "metadata": {},
   "source": [
    "## Host Tensor Creation\n",
    "\n",
    "Create a test tensor with different values that can\n",
    "demonstrate various operations. Learn more about Tensors [here](https://github.com/tenstorrent/tt-metal/blob/main/docs/source/ttnn/ttnn/tensor.rst)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2dff6e0e",
   "metadata": {},
   "outputs": [],
   "source": [
    "logger.info(\"\\n--- TT-NN Tensor Creation with Tiles (32x32) ---\")\n",
    "host_rand = torch.rand((32, 32), dtype=torch.float32)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ebc168d2",
   "metadata": {},
   "source": [
    "## Host Tensor Conversion and Creation\n",
    "\n",
    "Tensix cores operate most efficiently on tiled data, performing\n",
    "parallel computations. The host tensor is converted to a TT-NN tiled tensor or is created natively on the device.\n",
    "\n",
    "To convert PyTorch host tensors to TT-NN tiled tensors, use the following helper function: `to_tt_tile()`. \n",
    "This helper function creates a device tensor based on the `host_rand` PyTorch tensor."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "544b723b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Helper to create a TT-NN tensor from torch with TILE_LAYOUT and bfloat16\n",
    "def to_tt_tile(torch_tensor):\n",
    "   return ttnn.from_torch(torch_tensor, dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)\n",
    "\n",
    "tt_t1 = to_tt_tile(host_rand)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f37ac8de-2b08-44ab-8230-4a47e374827e",
   "metadata": {},
   "source": [
    "Alternatively, we can create and initialize tensors directly on the device \n",
    "using TT-NN's tensor creation functions. \n",
    "Creating tensors directly on the device is more efficient, \n",
    "it avoids the overhead of transfering data from the host to the device."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "90a85a47",
   "metadata": {},
   "outputs": [],
   "source": [
    "tt_t2 = ttnn.full(\n",
    "   shape=(32, 32),\n",
    "   fill_value=1.0,\n",
    "   dtype=ttnn.float32,\n",
    "   layout=ttnn.TILE_LAYOUT,\n",
    "   device=device,\n",
    ")\n",
    "tt_t3 = ttnn.zeros(\n",
    "   shape=(32, 32),\n",
    "   dtype=ttnn.bfloat16,\n",
    "   layout=ttnn.TILE_LAYOUT,\n",
    "   device=device,\n",
    ")\n",
    "tt_t4 = ttnn.ones(\n",
    "   shape=(32, 32),\n",
    "   dtype=ttnn.bfloat16,\n",
    "   layout=ttnn.TILE_LAYOUT,\n",
    "   device=device,\n",
    ")\n",
    "\n",
    "t5 = np.array([[5, 6], [7, 8]], dtype=np.float32).repeat(16, axis=0).repeat(16, axis=1)\n",
    "tt_t5 = ttnn.Tensor(t5, device=device, layout=ttnn.TILE_LAYOUT)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a9284b70",
   "metadata": {},
   "source": [
    "## Tile-Based Arithmetic Operations\n",
    "\n",
    "Tensors can perform the following arithmetic\n",
    "operations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "682ab02f",
   "metadata": {},
   "outputs": [],
   "source": [
    "logger.info(\"\\n--- TT-NN Tensor Operations on (32x32) Tiles ---\")\n",
    "add_result = ttnn.add(tt_t1, tt_t4)\n",
    "logger.info(f\"Addition:\\n{add_result}\")\n",
    "\n",
    "mul_result = ttnn.mul(tt_t1, tt_t5)\n",
    "logger.info(f\"Element-wise Multiplication:\\n{mul_result}\")\n",
    "\n",
    "matmul_result = ttnn.matmul(tt_t4, tt_t1, memory_config=ttnn.DRAM_MEMORY_CONFIG)\n",
    "logger.info(f\"Matrix Multiplication:\\n{matmul_result}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e459c187-3ecc-4d2b-a74e-e664f59d7fce",
   "metadata": {},
   "source": [
    "## Simulated Broadcasting - Row Vector Expansion\n",
    "\n",
    "Let's simulate broadcasting a row vector across a tile. Every element of a given column will contain the same value.\n",
    "This is useful for operations that require expanding a smaller tensor to match the dimensions of a larger one.\n",
    "\n",
    "$$\n",
    "\\begin{bmatrix}\n",
    "1 & 2 & \\cdots & 30 & 31 \\\\\n",
    "\\end{bmatrix}\n",
    "\\rightarrow\n",
    "\\begin{bmatrix}\n",
    "1 & 2 & \\cdots & 30 & 31 \\\\\n",
    "1 & 2 & \\cdots & 30 & 31 \\\\\n",
    "\\cdots & \\cdots & \\cdots & \\cdots \\\\\n",
    "1 & 2 & \\cdots & 30 & 31 \\\\\n",
    "1 & 2 & \\cdots & 30 & 31 \\\\\n",
    "\\end{bmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9190899",
   "metadata": {},
   "outputs": [],
   "source": [
    "logger.info(\"\\n--- Simulated Broadcasting (32x32 + Broadcasted Row Vector) ---\")\n",
    "broadcast_vector = torch.tensor(np.arange(0, 32), dtype=torch.float32).repeat(32, 1)\n",
    "logger.info(f\"Broadcast Row Vector:\\n{broadcast_vector}\")\n",
    "\n",
    "broadcast_tt = to_tt_tile(broadcast_vector)\n",
    "broadcast_add_result = ttnn.add(tt_t1, broadcast_tt)\n",
    "logger.info(f\"Broadcast Add Result (TT-NN):\\n{ttnn.to_torch(broadcast_add_result)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3bc16d3a-aae6-4cc0-84c3-9f9d54b20104",
   "metadata": {},
   "source": [
    "## Close the Device"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ccd8eb7d-74fc-4cc4-9afa-90c831435820",
   "metadata": {},
   "outputs": [],
   "source": [
    "ttnn.close_device(device)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f12c2a8",
   "metadata": {},
   "source": [
    "## Full Example and Output\n",
    "\n",
    "Lets put everything together in a complete example that can be run\n",
    "directly.\n",
    "\n",
    "[ttnn_basic_operations.py](https://github.com/tenstorrent/tt-metal/tree/main/ttnn/tutorials/basic_python/ttnn_basic_operations.py)\n",
    "\n",
    "Running this script will generate the following output:\n",
    "\n",
    "``` console\n",
    "$ python3 $TT_METAL_HOME/ttnn/tutorials/basic_python/ttnn_basic_operations.py\n",
    "2025-07-07 13:13:04.850 | info     |   SiliconDriver | Opened PCI device 7; KMD version: 1.34.0; API: 1; IOMMU: disabled (pci_device.cpp:198)\n",
    "2025-07-07 13:13:04.852 | info     |   SiliconDriver | Opened PCI device 7; KMD version: 1.34.0; API: 1; IOMMU: disabled (pci_device.cpp:198)\n",
    "2025-07-07 13:13:04.859 | info     |          Device | Opening user mode device driver (tt_cluster.cpp:190)\n",
    "2025-07-07 13:13:04.859 | info     |   SiliconDriver | Opened PCI device 7; KMD version: 1.34.0; API: 1; IOMMU: disabled (pci_device.cpp:198)\n",
    "2025-07-07 13:13:04.860 | info     |   SiliconDriver | Opened PCI device 7; KMD version: 1.34.0; API: 1; IOMMU: disabled (pci_device.cpp:198)\n",
    "2025-07-07 13:13:04.866 | info     |   SiliconDriver | Opened PCI device 7; KMD version: 1.34.0; API: 1; IOMMU: disabled (pci_device.cpp:198)\n",
    "2025-07-07 13:13:04.867 | info     |   SiliconDriver | Opened PCI device 7; KMD version: 1.34.0; API: 1; IOMMU: disabled (pci_device.cpp:198)\n",
    "2025-07-07 13:13:04.873 | info     |   SiliconDriver | Harvesting mask for chip 0 is 0x100 (NOC0: 0x100, simulated harvesting mask: 0x0). (cluster.cpp:282)\n",
    "2025-07-07 13:13:04.970 | info     |   SiliconDriver | Opened PCI device 7; KMD version: 1.34.0; API: 1; IOMMU: disabled (pci_device.cpp:198)\n",
    "2025-07-07 13:13:05.015 | info     |   SiliconDriver | Opening local chip ids/pci ids: {0}/[7] and remote chip ids {} (cluster.cpp:147)\n",
    "2025-07-07 13:13:05.025 | info     |   SiliconDriver | Software version 6.0.0, Ethernet FW version 6.14.0 (Device 0) (cluster.cpp:1039)\n",
    "2025-07-07 13:13:05.111 | info     |           Metal | AI CLK for device 0 is:   1000 MHz (metal_context.cpp:128)\n",
    "2025-07-07 13:13:05.678 | info     |           Metal | Initializing device 0. Program cache is enabled (device.cpp:428)\n",
    "2025-07-07 13:13:05.680 | warning  |           Metal | Unable to bind worker thread to CPU Core. May see performance degradation. Error Code: 22 (hardware_command_queue.cpp:74)\n",
    "2025-07-07 13:13:07.537 | INFO     | __main__:main:15 - \n",
    "--- TT-NN Tensor Creation with Tiles (32x32) ---\n",
    "2025-07-07 13:13:07.564 | INFO     | __main__:main:47 - \n",
    "--- TT-NN Tensor Operations on (32x32) Tiles ---\n",
    "2025-07-07 13:13:08.072 | INFO     | __main__:main:49 - Addition:\n",
    "ttnn.Tensor([[ 1.82812,  1.04688,  ...,  1.32812,  1.00781],\n",
    "             [ 1.39844,  1.03906,  ...,  1.14844,  1.24219],\n",
    "             ...,\n",
    "             [ 1.65625,  1.32812,  ...,  1.31250,  1.21094],\n",
    "             [ 1.21875,  1.33594,  ...,  1.37500,  1.62500]], shape=Shape([32, 32]), dtype=DataType::BFLOAT16, layout=Layout::TILE)\n",
    "2025-07-07 13:13:13.670 | INFO     | __main__:main:52 - Element-wise Multiplication:\n",
    "ttnn.Tensor([[ 4.12500,  0.23438,  ...,  1.96875,  0.02600],\n",
    "             [ 1.97656,  0.18164,  ...,  0.87891,  1.44531],\n",
    "             ...,\n",
    "             [ 4.59375,  2.31250,  ...,  2.48438,  1.65625],\n",
    "             [ 1.50781,  2.35938,  ...,  2.96875,  4.96875]], shape=Shape([32, 32]), dtype=DataType::BFLOAT16, layout=Layout::TILE)\n",
    "2025-07-07 13:13:14.229 | INFO     | __main__:main:55 - Matrix Multiplication:\n",
    "ttnn.Tensor([[16.50000, 14.25000,  ..., 15.56250, 14.43750],\n",
    "             [16.50000, 14.25000,  ..., 15.56250, 14.43750],\n",
    "             ...,\n",
    "             [16.50000, 14.25000,  ..., 15.56250, 14.43750],\n",
    "             [16.50000, 14.25000,  ..., 15.56250, 14.43750]], shape=Shape([32, 32]), dtype=DataType::BFLOAT16, layout=Layout::TILE)\n",
    "2025-07-07 13:13:14.229 | INFO     | __main__:main:57 - \n",
    "--- Simulated Broadcasting (32x32 + Broadcasted Row Vector) ---\n",
    "2025-07-07 13:13:14.231 | INFO     | __main__:main:59 - Broadcast Row Vector:\n",
    "tensor([[ 0.,  1.,  2.,  ..., 29., 30., 31.],\n",
    "        [ 0.,  1.,  2.,  ..., 29., 30., 31.],\n",
    "        [ 0.,  1.,  2.,  ..., 29., 30., 31.],\n",
    "        ...,\n",
    "        [ 0.,  1.,  2.,  ..., 29., 30., 31.],\n",
    "        [ 0.,  1.,  2.,  ..., 29., 30., 31.],\n",
    "        [ 0.,  1.,  2.,  ..., 29., 30., 31.]])\n",
    "2025-07-07 13:13:14.233 | INFO     | __main__:main:63 - Broadcast Add Result (TT-NN):\n",
    "tensor([[ 0.8242,  1.0469,  2.2500,  ..., 29.0000, 30.3750, 31.0000],\n",
    "        [ 0.3945,  1.0391,  2.5625,  ..., 29.1250, 30.1250, 31.2500],\n",
    "        [ 0.2188,  1.8750,  2.4375,  ..., 29.7500, 30.8750, 31.6250],\n",
    "        ...,\n",
    "        [ 0.7422,  1.1484,  2.9531,  ..., 29.1250, 30.5000, 31.1250],\n",
    "        [ 0.6562,  1.3281,  2.0938,  ..., 29.3750, 30.3750, 31.2500],\n",
    "        [ 0.2158,  1.3359,  2.8438,  ..., 29.2500, 30.3750, 31.6250]],\n",
    "       dtype=torch.bfloat16)\n",
    "2025-07-07 13:13:14.233 | info     |           Metal | Closing mesh device 1 (mesh_device.cpp:488)\n",
    "2025-07-07 13:13:14.234 | info     |           Metal | Closing mesh device 0 (mesh_device.cpp:488)\n",
    "2025-07-07 13:13:14.234 | info     |           Metal | Closing device 0 (device.cpp:468)\n",
    "2025-07-07 13:13:14.234 | info     |           Metal | Disabling and clearing program cache on device 0 (device.cpp:783)\n",
    "2025-07-07 13:13:14.234 | info     |           Metal | Closing mesh device 1 (mesh_device.cpp:488)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e87bee85",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
